In re Application of: Ariel PELED et al.
Serial No.: 10/815,764 
Filed: April 2, 2004 

Final Office Action Mailing Date: December 23, 2008 
In the claims: 

1. (Currently Amended) A method for detecting an information item 
within an information sequence obtained from a digital medium, said information 
item comprising any one of a specified set of prestored information items whose 
distribution it is desired to control, comprising: 

transforming each of said set of prestored information items whose 
distribution it is desired to control from a first representation format into a respective 
format facilitating a first comparison, said first comparison being fast in relation
to a second relatively slower textual comparison, in accordance with a predetermined 
transformation format, said predetermined transformation format being preservative 
of meaning; 

transforming said information sequence obtained from said digital 
medium, into said format facilitating said first relatively fast comparison in 
accordance with said transformation format; 

determining the presence of one or more of said prestored information 
items within said transformed information sequence, said determining comprising: 

comparing said information sequence with said information item in 
said format facilitating said relatively fast comparison; and 

when a match is found between said formats facilitating said
relatively fast comparison then carrying out said second relatively slower textual
comparison between said respective prestored information item and said extracted
information sequence, and

when a match is found using said second relatively slower textual 
comparison, applying a policy to control distribution of said information sequence.
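
For illustration only, and without limiting the claim, the two-stage comparison recited in claim 1 might be sketched as follows. The function names (`to_fast_format`, `build_index`, `detect`, `apply_policy`), the choice of case-folding plus CRC-32 as the meaning-preserving transformation, the restriction to single-word items, and the "block"/"allow" policy labels are all assumptions made for this sketch and do not appear in the application.

```python
import zlib


def to_fast_format(text: str) -> int:
    """Assumed transformation: case-fold, collapse whitespace, then CRC-32."""
    canonical = " ".join(text.casefold().split())
    return zlib.crc32(canonical.encode("utf-8"))


def build_index(prestored_items):
    """Transform every prestored item once and index it by its fast format."""
    index = {}
    for item in prestored_items:
        index.setdefault(to_fast_format(item), []).append(item)
    return index


def detect(sequence: str, index) -> list:
    """Return the prestored items confirmed to occur in the sequence."""
    matches = []
    for token in sequence.split():
        # First, relatively fast comparison: integer lookup in the index.
        candidates = index.get(to_fast_format(token), [])
        for item in candidates:
            # Second, relatively slower textual comparison, which rules out
            # accidental collisions of the fast format.
            if item.casefold() == token.casefold():
                matches.append(item)
    return matches


def apply_policy(sequence: str, index) -> str:
    """Hypothetical distribution policy: block when any item is confirmed."""
    return "block" if detect(sequence, index) else "allow"
```

In this sketch the textual comparison runs only for tokens that already passed the fast comparison, so most of the scanned sequence is handled by integer lookups alone.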

2. (Original) A method according to claim 1, further comprising storing 
said representations in a database. 

3. (Original) A method according to claim 1, further comprising sorting 
said representations into a sorted list. 

4. (Original) A method according to claim 3, wherein said sorting is in
accordance with a tree sorting algorithm. 

5. (Original) A method according to claim 1, wherein said information 
item comprises a single word. 

6. (Original) A method according to claim 1, wherein said information 
item comprises a sequence of words. 

7. (Original) A method according to claim 1, wherein said information 
item comprises a delimited sequence of sub-items. 

8. (Original) A method according to claim 7, wherein each of said sub- 
items comprises a sequence of alphanumeric characters. 

9. (Original) A method according to claim 1, wherein a type of said 
information item comprises one of a group of types comprising: a word, a phrase, a 
number, a credit-card number, a social security number, a name, an address, an email 
address, and an account number. 

10. (Original) A method according to claim 1, wherein said information 
sequence is provided over a digital traffic channel. 

11. (Original) A method according to claim 10, wherein said digital traffic
channel comprises one of a group of channels comprising: email, instant messaging, 
peer-to-peer network, fax, and a local area network. 

12. (Original) A method according to claim 1, wherein said information 
sequence comprises the body of an email.

13. (Original) A method according to claim 1, wherein said information 
sequence comprises an email attachment. 

14. (Original) A method according to claim 1, further comprising
retrieving said information sequence from a digital storage medium. 

15. (Previously Presented) A method according to claim 14, wherein said 
digital storage medium comprises a digital cache memory. 

16. (Original) A method according to claim 1, wherein said representation 
depends only on the textual and numeric content of the information item. 

17. (Previously Presented) A method according to claim 1, wherein said 
transforming into a format that facilitates fast comparison comprises Unicode 
encoding. 

18. (Previously Presented) A method according to claim 1, wherein said 
transforming into a format that facilitates fast comparison comprises converting all 
characters to upper-case characters or to lower-case characters. 

19. (Previously Presented) A method according to claim 1, wherein said 
transforming into a format that facilitates fast comparison comprises encoding an 
information item into a numeric representation. 

20. (Previously Presented) A method according to claim 1, wherein said 
transforming into a format that facilitates fast comparison comprises applying a first 
hashing function to said representations. 

21. (Original) A method according to claim 1, wherein said information 
sequence comprises sub-sequences. 



22. (Original) A method according to claim 21, wherein said sub- 
sequences are separated by delimiters. 

23. (Original) A method according to claim 22 wherein said sub-
sequences separated by delimiters are any of: words, names, and numbers.

24. (Original) A method according to claim 23, further comprising 
scanning said information sequence to identify said sub-sequences. 

25. (Original) A method according to claim 24, and said determining is 
performed by matching said information item to an ordered series of said sub- 
sequences. 

26. (Cancelled) 

27. (Currently Amended) A method according to claim 1, wherein said
policy is a security policy, said security policy comprises at least one of the 
following group of security policies: blocking said transmission, logging a record of 
said detection and detection details, and reporting said detection and detection 
details. 

28. (Currently Amended) A method according to claim 27, wherein
said information items are divided into sets, and wherein said security policy depends 
on the number of detected information items that belong to the same set. 

29. (Original) A method according to claim 28 wherein each of said sets 
comprises information items associated with a single individual. 
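
As a non-limiting illustration of claims 27 to 29, a security policy that depends on how many detected items belong to the same set might look like the sketch below. The function name `choose_policy`, the `item_to_set` mapping, the threshold of two items, and the "allow"/"log"/"block" labels are assumptions introduced for this example only.

```python
from collections import Counter


def choose_policy(detected_items, item_to_set, block_threshold=2):
    """Pick a policy from the largest number of distinct detected items
    that fall into one set (for example, one individual's records)."""
    counts = Counter(
        item_to_set[item] for item in set(detected_items) if item in item_to_set
    )
    if not counts:
        return "allow"  # nothing detected
    if max(counts.values()) >= block_threshold:
        return "block"  # several items about the same individual
    return "log"        # isolated detections are only recorded
```

The design choice sketched here is that a lone name is weak evidence of a leak, while a name together with a card number from the same set is much stronger.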

30. (Original) A method according to claim 1, wherein said information 
item comprises a sequence of sub-items. 



31. (Original) A method according to claim 30, wherein said sub-items 
are separated by delimiters. 

32. (Original) A method according to claim 30, wherein a sub-item
comprises one of a group comprising: a word, a number, and a character string. 

33. (Original) A method according to claim 30, wherein said determining 
comprises using a state machine operable to detect said sequence of delimited sub- 
items within said information sequence. 

34. (Previously Presented) A method according to claim 30, wherein said 
transforming into a format facilitating fast comparison comprises: 

applying a first hashing function to assign a respective preliminary 
hash value to each sub-item within said information item; and 

applying a second hashing function to assign a global hash value to
said information item based on said preliminary hash values of said sub-items. 
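
By way of example only, the two-level hashing of claim 34 might be sketched as follows. The use of truncated SHA-256 for both hashing functions, the case-folding step, whitespace as the delimiter, and the names `preliminary_hash`, `global_hash`, and `hash_item` are assumptions for this sketch; the claim does not specify any particular hashing function.

```python
import hashlib


def preliminary_hash(sub_item: str) -> int:
    """First hashing function (assumed): 64 bits of SHA-256 over the case-folded sub-item."""
    digest = hashlib.sha256(sub_item.casefold().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def global_hash(preliminary_values) -> int:
    """Second hashing function (assumed): hash the preliminary values in order."""
    h = hashlib.sha256()
    for value in preliminary_values:
        h.update(value.to_bytes(8, "big"))
    return int.from_bytes(h.digest()[:8], "big")


def hash_item(item: str, delimiter=None):
    """Return (preliminary hash values, global hash value) for one information item."""
    prelim = [preliminary_hash(s) for s in item.split(delimiter)]
    return prelim, global_hash(prelim)
```

Because the global value is computed from the preliminary values rather than from the raw text, the per-sub-item hashes can be reused when the same sub-items appear in several candidate sequences.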

35. (Original) A method according to claim 34, wherein said information 
sequence comprises sub-sequences, and wherein said determining comprises: 

applying said first hashing function to assign a respective preliminary 
hash value to each of said sub-sequences; 

applying said second hashing function to at least one of said 
preliminary hash values to assign a global hash value to said at least one of said sub- 
sequences; and 

comparing said global hash value to hash values of said sub- 
sequences. 

36. (Original) A method according to claim 35, wherein said sub- 
sequences comprise one of a group comprising: a word, a number, and a character 
string.

37. (Previously Presented) A method according to claim 35, wherein said 
sub-sequences comprise a plurality of ordered combinations of sub-sequences within 
said data sequence. 

38. (Previously Presented) A method according to claim 36, wherein said
sub-sequences comprise a plurality of combinations of sub-sequences within said 
data sequence. 

39. (Original) A method according to claim 38, wherein said second hash 
function is invariant to reordering of at least two of said sub-sequences. 
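
One possible way to obtain a second hash function that is invariant to reordering, as recited in claim 39, is to combine the preliminary hash values with a commutative operation. The sketch below uses addition modulo 2^64; that choice, the truncated SHA-256 preliminary hash, and the names `preliminary_hash` and `order_invariant_hash` are assumptions and are not taken from the application.

```python
import hashlib

MASK64 = (1 << 64) - 1


def preliminary_hash(sub_sequence: str) -> int:
    """Assumed first hashing function: 64 bits of SHA-256 over the case-folded text."""
    digest = hashlib.sha256(sub_sequence.casefold().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def order_invariant_hash(sub_sequences) -> int:
    """Assumed second hashing function: sum of preliminary hashes modulo 2**64.
    Addition is commutative, so any reordering gives the same value."""
    total = 0
    for s in sub_sequences:
        total = (total + preliminary_hash(s)) & MASK64
    return total
```

With such a function, "John Smith" and "Smith, John" (once delimiters are removed) would produce the same global hash value, leaving any stricter distinction to a later comparison step.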

40. (Previously Presented) A method according to claim 39, further 
comprising checking whether a delimited segment was previously stored, and 
continuing said detection process only if a current delimited segment was previously 
stored. 

41-48 (Cancelled) 

49. (Currently Amended) An apparatus for detecting a predefined 
information item within a new information sequence for distribution control, said
information item being any one of a specified set of data items, comprising: 

a preprocessor, for transforming said predefined information item into 
a canonical representation, said transformation being preservative of meaning, in
accordance with a canonical transformation format; and 

a scanner, for scanning said new information sequence to identify sub- 
sequences therewithin; and 

a comparator associated with said preprocessor and said scanner, for 
making a first relatively fast comparison involving comparing said canonical 
representation to said identified sub-sequences to make an initial determination of the 
presence of said specified information item within said information sequence, and 
wherever a match is found using said relatively fast comparison involving canonical 
representation, then comparing original text to make a
second determination of a match, the apparatus being configured to apply a
policy for controlling distribution of said information sequence when said match is 
detected. 

50. (Original) An apparatus for detecting a specified information item
within an information sequence according to claim 49, further comprising a user 
interface for inputting said information items. 

51. (Previously Presented) An apparatus for detecting a specified 
information item within an information sequence according to claim 49, wherein said 
scanner is further operable to transform said information sequence in accordance 
with said canonical transformation format. 

52. (Previously Presented) An apparatus for detecting a specified 
information item within an information sequence according to claim 49, wherein said 
scanner is further operable to transform said sub-sequences in accordance with said 
canonical transformation format. 

53. (Original) An apparatus for detecting a specified information item 
within an information sequence according to claim 49, further comprising a database 
for storing a representation of each data item of said set. 

54. (Original) An apparatus for detecting a specified information item 
within an information sequence according to claim 49, wherein said information 
sequence is obtained from a digital medium. 

55. (Original) An apparatus for detecting a specified information item 
within an information sequence according to claim 49, further comprising a sorter,
for forming a sorted list of the respective representations of set of data items. 

56. (Original) An apparatus for detecting a specified information item 
within an information sequence according to claim 49, wherein a type of said 
information item comprises one of a group of types comprising: a word, a phrase, a 
number, a credit-card number, a social security number, a name, an address, an email 
address, and an account number. 

57. (Original) An apparatus for detecting a specified information item
within an information sequence according to claim 49, wherein said information 
sequence is provided over a digital traffic channel. 

58. (Original) An apparatus for detecting a specified information item 
within an information sequence according to claim 49, further comprising retrieving 
said information sequence from a digital storage medium. 

59. (Original) An apparatus for detecting a specified information item 
within an information sequence according to claim 58, wherein said digital storage 
medium comprises digital storage medium within a proxy server. 

60. (Cancelled) 

61. (Original) An apparatus for detecting a specified information item 
within an information sequence according to claim 49, wherein said encoding 
function comprises a hashing function. 

62. (Original) A method according to claim 2, wherein said transforming 
said representation and storage of said information items comprises: 

a) assigning a hash value to each delimited segment within said 
information item; 

b) assigning a hash value for said information item based on said 
hashes assigned to delimited segments within said information item; 

c) storing said hash values evaluated in step a) and step b) above; 
and wherein detecting said information items within said digital
medium comprises:

d) assigning a hash value to each delimited segment within said 
digital medium utilizing the same hash function used in step a) above; 

e) assigning a hash value for sequences of delimited segments 
utilizing the same hash function used in step b) above, said sequences being of 
pluralities of possible numbers of delimited segments within said information items; 

f) comparing the hash values evaluated in step e) above with
said hash values stored in step c) above.
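
Purely as an illustration, steps a) through f) of claim 62 might be sketched as follows. The CRC-32 hash functions, whitespace delimiters, the in-memory `set` and `dict` used as the store, the skipping of windows whose first segment was never stored, and all function names are assumptions for this sketch rather than features stated in the claim.

```python
import zlib


def segment_hash(segment: str) -> int:
    """Steps a) and d): hash one delimited segment (assumed: CRC-32 of case-folded text)."""
    return zlib.crc32(segment.casefold().encode("utf-8"))


def sequence_hash(segment_hashes) -> int:
    """Steps b) and e): hash a run of segments from their segment hashes."""
    h = 0
    for value in segment_hashes:
        h = zlib.crc32(value.to_bytes(4, "big"), h)
    return h


def store_items(items):
    """Step c): store the segment hashes and the per-item hashes."""
    segment_store = set()
    item_store = {}
    for item in items:
        hashes = [segment_hash(s) for s in item.split()]
        segment_store.update(hashes)
        key = (len(hashes), sequence_hash(hashes))
        item_store.setdefault(key, []).append(item)
    return segment_store, item_store


def scan(text, segment_store, item_store):
    """Steps d) to f): hash the medium's segments, then hash windows of every
    stored item length and compare them with the stored item hashes."""
    hashes = [segment_hash(s) for s in text.split()]
    lengths = sorted({n for n, _ in item_store})
    hits = []
    for start in range(len(hashes)):
        if hashes[start] not in segment_store:
            continue  # assumed shortcut: this segment starts no stored item
        for n in lengths:
            window = hashes[start:start + n]
            if len(window) < n:
                break  # longer windows would run past the end of the text
            hits.extend(item_store.get((n, sequence_hash(window)), []))
    return hits
```

A hit from `scan` is only a hash-level match in this sketch; a confirming textual comparison of the kind recited in claim 1 would follow it.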



